Abstract. Monotone systems of polynomial equations (MSPEs) are systems of fixed-point
equations $X_1 = f_1(X_1, \ldots, X_n), \ldots, X_n = f_n(X_1, \ldots, X_n)$ where each $f_i$
is a polynomial with positive real coefficients. The question of computing the least
non-negative solution of a given MSPE $X = f(X)$ arises naturally in the analysis of
stochastic models such as stochastic context-free grammars, probabilistic pushdown automata,
and back-button processes. Etessami and Yannakakis have recently adapted Newton's iterative
method to MSPEs. In a previous paper we have proved the existence of a threshold $k_f$
for strongly connected MSPEs, such that after $k_f$ iterations of Newton's method each
new iteration computes at least 1 new bit of the solution. However, the proof was purely
existential. In this paper we give an upper bound for $k_f$ as a function of the minimal
component of the least fixed point $\mu f$ of $f(X)$. Using this result we show that $k_f$ is
at most single exponential resp. linear for strongly connected MSPEs derived from probabilistic
pushdown automata resp. from back-button processes. Further, we prove the existence of
a threshold for arbitrary MSPEs after which each new iteration computes at least $1/(w 2^h)$
new bits of the solution, where $w$ and $h$ are the width and height of the DAG of strongly
connected components.



1. Introduction 

A monotone system of polynomial equations (MSPE for short) has the form

$$\begin{aligned} X_1 &= f_1(X_1, \ldots, X_n) \\ &\;\;\vdots \\ X_n &= f_n(X_1, \ldots, X_n) \end{aligned}$$

where $f_1, \ldots, f_n$ are polynomials with positive real coefficients. In vector form we denote an
MSPE by $X = f(X)$. We call MSPEs "monotone" because $x \le x'$ implies $f(x) \le f(x')$
for every $x, x' \in \mathbb{R}^n_{\ge 0}$. MSPEs appear naturally in the analysis of many stochastic models,
such as context-free grammars (with numerous applications to natural language processing
[T9l [To] , and computational biology [2TJ HI 13 d7J ) , probabilistic programs with procedures 
B [13 S El 13 EE], and web-surfing models with back buttons Hi]. 

By Kleene's theorem, a feasible MSPE $X = f(X)$ (i.e., an MSPE with at least one
solution) has a least solution $\mu f$; this solution can be irrational and non-expressible by
radicals. Given an MSPE and a vector $v$ encoded in binary, the problem whether $\mu f \le v$
holds is in PSPACE and at least as hard as the SQUARE-ROOT-SUM problem, a well-known
problem of computational geometry (see [10, 12] for more details).

For the applications mentioned above the most important question is the efficient numerical
approximation of the least solution. Finding the least solution of a feasible system
$X = f(X)$ amounts to finding the least solution of $F(X) = 0$ for $F(X) = f(X) - X$.
For this we can apply (the multivariate version of) Newton's method [20]: starting at some
$x^{(0)} \in \mathbb{R}^n$ (we use uppercase to denote variables and lowercase to denote values),
compute the sequence

$$x^{(k+1)} := x^{(k)} - \left(F'(x^{(k)})\right)^{-1} F(x^{(k)})$$

where $F'(X)$ is the Jacobian matrix of partial derivatives.

While in general the method may not even be defined ($F'(x^{(k)})$ may be singular for
some $k$), Etessami and Yannakakis proved in [10, 12] that this is not the case for the
Decomposed Newton's Method (DNM), which decomposes the MSPE into strongly connected
components (SCCs) and applies Newton's method to them in a bottom-up fashion.¹

The results of [10, 12] provide no information on the number of iterations needed to
compute $i$ valid bits of $\mu f$, i.e., to compute a vector $\nu$ such that
$|\mu f_j - \nu_j| / |\mu f_j| \le 2^{-i}$ for every $1 \le j \le n$. In a former paper [16]
we have obtained a first positive result on this problem. We have proved that for every
strongly connected MSPE $X = f(X)$ there exists a threshold $k_f$ such that for every
$i \ge 0$ the $(k_f + i)$-th iteration of Newton's method has at least $i$ valid bits of
$\mu f$. So, loosely speaking, after $k_f$ iterations DNM is guaranteed to compute at least
1 new bit of the solution per iteration; we say that DNM converges linearly with rate 1.

The problem with this result is that its proof provides no information on $k_f$ other than
its existence. In this paper we show that the threshold $k_f$ can be chosen as

$$k_f = 3n^2 m + 2n^2 |\log \mu_{\min}|$$

where $n$ is the number of equations of the MSPE, $m$ is such that all coefficients of the
MSPE can be given as ratios of $m$-bit integers, and $\mu_{\min}$ is the minimal component of the
least solution $\mu f$.

It can be objected that $k_f$ depends on $\mu f$, which is precisely what Newton's method
should compute. However, for MSPEs coming from stochastic models, such as the ones
listed above, we can do far better. The following observations and results help to deal with
this problem:

• We obtain a syntactic bound on $\mu_{\min}$ for probabilistic programs with procedures
(having stochastic context-free grammars and back-button stochastic processes as
special instances) and prove that in this case $k_f \le n 2^{n+2} m$.



¹ A subset of variables and their associated equations form an SCC if the value of any variable in the
subset influences the value of all variables in the subset; see Section 2 for details.
• We show that if every procedure has a non-zero probability of terminating, then
$k_f \le 3nm$. This condition always holds in the special case of back-button processes
[13, 14]. Hence, our result shows that $i$ valid bits can be computed in time
$O((nm + i) \cdot n^3)$ in the unit cost model of Blum, Shub and Smale p], where each
single arithmetic operation over the reals can be carried out exactly and in constant
time. It was proved in [13, 14] by a reduction to a semidefinite programming problem
that $i$ valid bits can be computed in poly($i$, $n$, $m$)-time in the classical
(Turing-machine based) computation model. We do not improve this result, because we
do not have a proof that round-off errors (which are inevitable on Turing-machine
based models) do not crucially affect the convergence of Newton's method. But our
result sheds light on the convergence of a practical method to compute $\mu f$.

• Finally, since $x^{(k)} \le x^{(k+1)} \le \mu f$ holds for every $k \ge 0$, as Newton's method
proceeds it provides better and better lower bounds for $\mu_{\min}$ and thus for $k_f$. In the
paper we exhibit an MSPE for which, using this fact and our theorem, we can prove
that no component of the solution reaches the value 1. This cannot be proved by
just computing more iterations, no matter how many.

The paper contains two further results concerning non-strongly-connected MSPEs: Firstly, 
we show that DNM still converges linearly even if the MSPE has more than one SCC, albeit 
the convergence rate is poorer. Secondly, we prove that Newton's method is well-defined 
also for non-strongly-connected MSPEs. Thus, it is not necessary to decompose an MSPE 
into its SCCs - although decomposing the MSPE may be preferred for efficiency reasons. 

The paper is structured as follows. In Section 2 we state preliminaries and give some
background on Newton's method applied to MSPEs. Sections 3, 5 and 6 contain the three
results of the paper. Section 4 contains applications of our main result. We conclude in
Section 7. Missing proofs can be found in a technical report [5].

2. Preliminaries 

In this section we introduce our notation and formalize the concepts mentioned in the 
introduction. 

2.1. Notation 

$\mathbb{R}$ and $\mathbb{N}$ denote the sets of real, respectively natural numbers. We assume
$0 \in \mathbb{N}$. $\mathbb{R}^n$ denotes the set of $n$-dimensional real valued column vectors and
$\mathbb{R}^n_{\ge 0}$ the subset of vectors with non-negative components. We use bold letters for
vectors, e.g. $\mathbf{x} \in \mathbb{R}^n$, where we assume that $\mathbf{x}$ has the components
$x_1, \ldots, x_n$. Similarly, the $i$-th component of a function
$f : \mathbb{R}^n \to \mathbb{R}^n$ is denoted by $f_i$.

$\mathbb{R}^{m \times n}$ denotes the set of matrices having $m$ rows and $n$ columns. The transpose of a
vector or matrix is indicated by the superscript $\top$. The identity matrix of $\mathbb{R}^{n \times n}$
is denoted by $\mathrm{Id}$.

The formal Neumann series of $A \in \mathbb{R}^{n \times n}$ is defined by
$A^* = \sum_{k \in \mathbb{N}} A^k$. It is well-known that $A^*$ exists if and only if the spectral
radius of $A$ is less than 1, i.e.
$\max\{|\lambda| \mid \mathbb{C} \ni \lambda \text{ is an eigenvalue of } A\} < 1$.
If $A^*$ exists, we have $A^* = (\mathrm{Id} - A)^{-1}$.

The partial order $\le$ on $\mathbb{R}^n$ is defined as usual by setting $x \le y$ if $x_i \le y_i$ for all
$1 \le i \le n$. By $x < y$ we mean $x \le y$ and $x \ne y$. Finally, we write $x \prec y$ if
$x_i < y_i$ in every component.






J. ESPARZA, S. KIEFER, AND M. LUTTENBERGER 



We use $X_1, \ldots, X_n$ as variable identifiers and arrange them into the vector $X$. In the
following $n$ always denotes the number of variables, i.e. the dimension of $X$. While $x, y, \ldots$
denote arbitrary elements in $\mathbb{R}^n$, resp. $\mathbb{R}^n_{\ge 0}$, we write $X$ if we want to
emphasize that a function is given w.r.t. these variables. Hence, $f(X)$ represents the function
itself, whereas $f(x)$ denotes its value for $x \in \mathbb{R}^n$.

If $Y$ is a set of variables and $x$ a vector, then by $x_Y$ we mean the vector obtained by
restricting $x$ to the components in $Y$.

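The Jacobian of a differentiable function $f(X)$ with $f : \mathbb{R}^n \to \mathbb{R}^m$ is the
matrix $f'(X)$ given by

$$f'(X) = \begin{pmatrix} \frac{\partial f_1}{\partial X_1} & \cdots & \frac{\partial f_1}{\partial X_n} \\ \vdots & & \vdots \\ \frac{\partial f_m}{\partial X_1} & \cdots & \frac{\partial f_m}{\partial X_n} \end{pmatrix} .$$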



2.2. Monotone Systems of Polynomials 

Definition 2.1. A function $f(X)$ with $f : \mathbb{R}^n_{\ge 0} \to \mathbb{R}^n_{\ge 0}$ is a
monotone system of polynomials (MSP), if every component $f_i(X)$ is a polynomial in the
variables $X_1, \ldots, X_n$ with coefficients in $\mathbb{R}_{\ge 0}$. We call an MSP $f(X)$ feasible
if $y = f(y)$ for some $y \in \mathbb{R}^n_{\ge 0}$.

Fact 2.2. Every MSP $f$ is monotone on $\mathbb{R}^n_{\ge 0}$, i.e. for $0 \le x \le y$ we have
$f(x) \le f(y)$.

Since every MSP is continuous, Kleene's fixed-point theorem (see e.g. [18]) applies. 

Theorem 2.3 (Kleene's fixed-point theorem). Every feasible MSP $f(X)$ has a least fixed
point $\mu f$ in $\mathbb{R}^n_{\ge 0}$, i.e., $\mu f = f(\mu f)$ and, in addition, $y = f(y)$ implies
$\mu f \le y$. Moreover, the sequence $(\kappa_f^{(k)})_{k \in \mathbb{N}}$ with $\kappa_f^{(0)} := 0$ and
$\kappa_f^{(k+1)} := f(\kappa_f^{(k)}) = f^{k+1}(0)$ is monotonically increasing with respect to $\le$
(i.e. $\kappa_f^{(k)} \le \kappa_f^{(k+1)}$) and converges to $\mu f$.

In the following we call $(\kappa_f^{(k)})_{k \in \mathbb{N}}$ the Kleene sequence of $f(X)$, and drop the
subscript whenever $f$ is clear from the context. Similarly, we sometimes write $\mu$ instead of
$\mu f$.
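
The Kleene sequence can be sketched in a few lines of Python. The one-variable system $X = 0.5 X^2 + 0.25$ used here is a hypothetical example (not taken from the paper), chosen because its least fixed point $1 - \sqrt{0.5}$ is known in closed form; the function names are illustrative.

```python
import math

def kleene_sequence(f, n, steps):
    """Return [kappa^(0), ..., kappa^(steps)] with kappa^(0) = 0 and kappa^(k+1) = f(kappa^(k))."""
    kappa = [0.0] * n
    sequence = [kappa]
    for _ in range(steps):
        kappa = f(kappa)
        sequence.append(kappa)
    return sequence

def example_msp(x):
    # Hypothetical MSP with a single variable: f(X) = 0.5 * X^2 + 0.25.
    return [0.5 * x[0] * x[0] + 0.25]

# The two fixed points of the example are 1 - sqrt(0.5) and 1 + sqrt(0.5).
LEAST_FIXED_POINT = 1.0 - math.sqrt(0.5)

kappas = kleene_sequence(example_msp, n=1, steps=60)
```

In this sketch the iterates increase monotonically and approach `LEAST_FIXED_POINT` from below, as Theorem 2.3 states.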

A variable $X_i$ of an MSP $f(X)$ is productive if $\kappa_i^{(k)} > 0$ for some $k \in \mathbb{N}$.
An MSP is clean if all its variables are productive. It is easy to see that $\kappa_i^{(n)} = 0$
implies $\kappa_i^{(k)} = 0$ for all $k \in \mathbb{N}$. As for context-free grammars we can determine
all productive variables in time linear in the size of $f$.
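
The productivity check can be illustrated as follows. This is a minimal sketch under an assumed encoding (not the paper's): each polynomial is a list of monomials `(coefficient, variable_indices)`, with an empty tuple for a constant term. For brevity it uses a simple round-based fixed-point loop rather than the linear-time worklist algorithm known from context-free grammars.

```python
def productive_variables(system):
    """Return the set of indices i such that kappa_i^(k) > 0 for some k.

    A variable becomes productive as soon as one of its monomials has a
    positive coefficient and only productive variables.
    """
    productive = set()
    changed = True
    while changed:
        changed = False
        for i, polynomial in enumerate(system):
            if i in productive:
                continue
            if any(coeff > 0 and all(v in productive for v in variables)
                   for coeff, variables in polynomial):
                productive.add(i)
                changed = True
    return productive

# Hypothetical system:
#   X0 = 0.5*X0*X1 + 0.5   (productive through the constant term)
#   X1 = 1.0*X1*X1         (never leaves 0)
#   X2 = 0.3*X1            (depends only on the unproductive X1)
example_system = [
    [(0.5, (0, 1)), (0.5, ())],
    [(1.0, (1, 1))],
    [(0.3, (1,))],
]
productive = productive_variables(example_system)
```

Removing the unproductive variables (here `X1` and `X2`) and setting them to 0 yields a clean system.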

Notation 2.4. In the following, we always assume that an MSP $f$ is clean and feasible.
I.e., whenever we write "MSP", we mean "clean and feasible MSP", unless explicitly stated
otherwise.

For the formal definition of the Decomposed Newton's Method (DNM) (see also Section 1)
we need the notion of dependence between variables.

Definition 2.5. Let $f(X)$ be an MSP. $X_i$ depends directly on $X_k$, denoted by
$X_i \trianglelefteq X_k$, if $\frac{\partial f_i}{\partial X_k}(X)$ is not the zero-polynomial.
$X_i$ depends on $X_k$ if $X_i \trianglelefteq^* X_k$, where $\trianglelefteq^*$ is the
reflexive transitive closure of $\trianglelefteq$. An MSP is strongly connected (short: an scMSP)
if all its variables depend on each other.

Any MSP can be decomposed into strongly connected components (SCCs), where an SCC
$S$ is a maximal set of variables such that each variable in $S$ depends on each other variable
in $S$. The following result for strongly connected MSPs was proved in [10, 12]:
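
The dependence relation and the resulting SCCs can be computed directly from the syntax of the system. The sketch below assumes a hypothetical encoding in which each polynomial is a list of monomials `(coefficient, variable_indices)`; it computes the reflexive transitive closure with Warshall's algorithm, which is simple but cubic, whereas a production implementation would more likely use a linear-time SCC algorithm.

```python
def strongly_connected_components(system):
    """Group variable indices into SCCs of the dependence relation of Definition 2.5."""
    n = len(system)
    # depends[i][k]: X_i depends on X_k (reflexive, later closed transitively).
    depends = [[i == k or any(k in variables
                              for coeff, variables in system[i] if coeff > 0)
                for k in range(n)]
               for i in range(n)]
    for mid in range(n):
        for i in range(n):
            for k in range(n):
                if depends[i][mid] and depends[mid][k]:
                    depends[i][k] = True
    components, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        component = frozenset(k for k in range(n)
                              if depends[i][k] and depends[k][i])
        assigned |= component
        components.append(component)
    return components

# Hypothetical system:
#   X0 = 0.5*X0*X1 + 0.5
#   X1 = 0.5*X0 + 0.25*X2 + 0.25
#   X2 = 0.5*X2*X2 + 0.5
example_system = [
    [(0.5, (0, 1)), (0.5, ())],
    [(0.5, (0,)), (0.25, (2,)), (0.25, ())],
    [(0.5, (2, 2)), (0.5, ())],
]
components = strongly_connected_components(example_system)
```

Here `X0` and `X1` depend on each other and form one SCC, while `X2` forms a lower SCC on which the first one depends.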

Theorem 2.6. Let $f(X)$ be an scMSP and define the Newton operator $\mathcal{N}_f$ as follows:

$$\mathcal{N}_f(X) = X + (\mathrm{Id} - f'(X))^{-1}(f(X) - X) .$$

We have: (1) $\mathcal{N}_f(x)$ is defined for all $0 \le x \prec \mu f$ (i.e., $(\mathrm{Id} - f'(x))^{-1}$
exists). Moreover, $f'(x)^* = \sum_{k \in \mathbb{N}} f'(x)^k$ exists for all $0 \le x \prec \mu f$, and so
$\mathcal{N}_f(X) = X + f'(X)^* (f(X) - X)$.
(2) The Newton sequence $(\nu_f^{(k)})_{k \in \mathbb{N}}$ with $\nu^{(k)} = \mathcal{N}_f^k(0)$ is
monotonically increasing, bounded from above by $\mu f$ (i.e.
$\nu^{(k)} \le f(\nu^{(k)}) \le \nu^{(k+1)} \prec \mu f$), and converges to $\mu f$.

DNM works by substituting the variables of lower SCCs by corresponding Newton approx- 
imations that were obtained earlier. 
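
To make the Newton operator concrete, here is a one-variable sketch. The system $X = 0.5 X^2 + 0.5$ is a hypothetical example (not from this paper) with least fixed point 1, where $\mathrm{Id} - f'(\mu f)$ is singular. For this system the Newton step simplifies to $x + 0.5(1 - x)$, so the error halves in every step, i.e. exactly one bit is gained per iteration.

```python
def f(x):
    # Hypothetical one-variable scMSP: f(X) = 0.5 * X^2 + 0.5, least fixed point 1.
    return 0.5 * x * x + 0.5

def f_prime(x):
    # Derivative of f; in one dimension the Jacobian is just this number.
    return x

def newton_step(x):
    # N_f(x) = x + (1 - f'(x))^(-1) * (f(x) - x), defined for 0 <= x < 1.
    return x + (f(x) - x) / (1.0 - f_prime(x))

def newton_sequence(steps):
    nu = 0.0
    sequence = [nu]
    for _ in range(steps):
        nu = newton_step(nu)
        sequence.append(nu)
    return sequence

newton_iterates = newton_sequence(10)
errors = [1.0 - nu for nu in newton_iterates]  # distance to the least fixed point 1
```

The iterates are 0, 0.5, 0.75, 0.875 and so on: monotonically increasing and strictly below the least fixed point, as in part (2) of the theorem.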



3. A Threshold for scMSPs 

In this section we obtain a threshold after which DNM is guaranteed to converge linearly 
with rate 1. 

We showed in [16] that for worst-case results on the convergence of Newton's method it is
enough to consider quadratic MSPs, i.e., MSPs whose monomials have degree at most 2. The
reason is that any MSP (resp. scMSP) $f$ can be transformed into a quadratic MSP (resp.
scMSP) $\tilde{f}$ by introducing auxiliary variables. This transformation is very similar to the
transformation of a context-free grammar into Chomsky normal form. The transformation
does not accelerate DNM, i.e., DNM on $f$ is at least as fast (in a formal sense) as DNM on
$\tilde{f}$, and so for a worst-case analysis, it suffices to consider quadratic systems. We refer the
reader to [16] for details.

We start by defining the notion of "valid bits".

Definition 3.1. Let $f(X)$ be an MSP. A vector $\nu$ has $i$ valid bits of the least fixed point
$\mu f$ if $|\mu f_j - \nu_j| / |\mu f_j| \le 2^{-i}$ for every $1 \le j \le n$.

In the rest of the section we prove the following: 

Theorem 3.2. Let $f(X)$ be a quadratic scMSP. Let $c_{\min}$ be the smallest nonzero coefficient
of $f$ and let $\mu_{\min}$ and $\mu_{\max}$ be the minimal and maximal component of $\mu f$,
respectively. Let

$$k_f = n \cdot \log\left(\frac{\mu_{\max}}{c_{\min} \cdot \mu_{\min} \cdot \min\{\mu_{\min}, 1\}}\right) .$$

Then $\nu^{(\lceil k_f \rceil + i)}$ has $i$ valid bits of $\mu f$ for every $i \ge 0$.

Loosely speaking, the theorem states that after $k_f$ iterations of Newton's method, every
subsequent iteration guarantees at least one more valid bit. It may be objected that $k_f$
depends on the least fixed point $\mu f$, which is precisely what Newton's method should
compute. However, in the next section we show that there are important classes of MSPs
(in fact, those which motivated our investigation), for which bounds on $\mu_{\min}$ can be easily
obtained.

The following corollary is weaker than Theorem 3.2, but less technical in that it avoids
a dependence on $\mu_{\max}$ and $c_{\min}$.

Corollary 3.3. Let $f(X)$ be a quadratic scMSP of dimension $n$ whose coefficients are
given as ratios of $m$-bit integers. Let $\mu_{\min}$ be the minimal component of $\mu f$. Let
$k_f = 3n^2 m + 2n^2 |\log \mu_{\min}|$. Then $\nu^{(\lceil k_f \rceil + i)}$ has at least $i$ valid bits
of $\mu f$ for every $i \ge 0$.



Corollary 3.3 follows from Theorem 3.2 by a suitable bound on $\mu_{\max}$ in terms of $c_{\min}$ and
$\mu_{\min}$ [5] (notice that, since $c_{\min}$ is the quotient of two $m$-bit integers, we have
$c_{\min} \ge 1/2^m$).

In the rest of the section we sketch the proof of Theorem 3.2. The proof makes crucial
use of vectors $d \succ 0$ such that $d \ge f'(\mu f) d$. We call a vector satisfying these two
conditions a cone vector of $f$ or, when $f$ is clear from the context, just a cone vector.

In a previous paper we have shown that if the matrix $(\mathrm{Id} - f'(\mu f))$ is singular, then $f$
has a cone vector ([16], Lemmata 4 and 8). As a first step towards the proof of Theorem 3.2
we show the following stronger proposition.

Proposition 3.4. Any scMSP has a cone vector. 

To a cone vector $d = (d_1, \ldots, d_n)$ we associate two parameters, namely the maximum and
the minimum of the ratios $\mu f_1/d_1, \mu f_2/d_2, \ldots, \mu f_n/d_n$, which we denote by
$\lambda_{\max}$ and $\lambda_{\min}$, respectively. The second step consists of showing
(Proposition 3.6) that given a cone vector $d$, the threshold
$k_{f,d} = \log(\lambda_{\max}/\lambda_{\min})$ satisfies the same property as $k_f$ in Theorem 3.2,
i.e., $\nu^{(\lceil k_{f,d} \rceil + i)}$ has $i$ valid bits of $\mu f$ for every $i \ge 0$. This follows
rather easily from the following fundamental property of cone vectors: a cone vector leads to an
upper bound on the error of Newton's method.



Lemma 3.5. Let $d$ be a cone vector of an MSP $f$ and let
$\lambda_{\max} = \max_j \{\mu f_j / d_j\}$. Then

$$\mu f - \nu^{(k)} \le 2^{-k} \lambda_{\max} d .$$



Proof Idea. Consider the ray $g(t) = \mu f - t d$ starting in $\mu f$ and headed in the direction $-d$
(the dashed line in the picture below). It is easy to see that $g(\lambda_{\max})$ is the intersection
of $g$ with an axis which is located farthest from $\mu f$. One can then prove
$g(\frac{1}{2}\lambda_{\max}) \le \nu^{(1)}$, where $g(\frac{1}{2}\lambda_{\max})$ is the point of the ray
equidistant from $g(\lambda_{\max})$ and $\mu f$. By repeated application of this argument one obtains
$g(2^{-k}\lambda_{\max}) \le \nu^{(k)}$ for all $k \in \mathbb{N}$.

The following picture shows the Newton iterates $\nu^{(k)}$ for $0 \le k \le 2$ (shape: ×) and
the corresponding points $g(2^{-k}\lambda_{\max})$ (shape: +) located on the ray $g$. Notice that
$\nu^{(k)} \ge g(2^{-k}\lambda_{\max})$. ■

[Figure: Newton iterates and the corresponding points on the ray $g$, which starts at $\mu f = g(0)$.]

Now we easily obtain:

Proposition 3.6. Let $f(X)$ be an scMSP and let $d$ be a cone vector of $f$. Let
$k_{f,d} = \log \frac{\lambda_{\max}}{\lambda_{\min}}$, where
$\lambda_{\max} = \max_j \frac{\mu f_j}{d_j}$ and $\lambda_{\min} = \min_j \frac{\mu f_j}{d_j}$. Then
$\nu^{(\lceil k_{f,d} \rceil + i)}$ has at least $i$ valid bits of $\mu f$ for every $i \ge 0$.
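
The step from Lemma 3.5 to Proposition 3.6 can be spelled out as a short calculation. The sketch below reads Lemma 3.5 as the componentwise bound $\mu f - \nu^{(k)} \le 2^{-k}\lambda_{\max} d$, which is what the inequality $g(2^{-k}\lambda_{\max}) \le \nu^{(k)}$ from the proof idea amounts to, and it assumes that $\log$ denotes the logarithm to base 2.

```latex
\begin{align*}
\frac{\mu f_j - \nu^{(k)}_j}{\mu f_j}
  &\le \frac{2^{-k}\,\lambda_{\max}\, d_j}{\mu f_j}
     && \text{(Lemma 3.5, $j$-th component)} \\
  &\le 2^{-k}\,\frac{\lambda_{\max}}{\lambda_{\min}}
     && \text{(since } \mu f_j / d_j \ge \lambda_{\min} \text{)} \\
  &= 2^{-k + k_{f,d}}
     && \text{(as } k_{f,d} = \log(\lambda_{\max}/\lambda_{\min}) \text{)} .
\end{align*}
% For k = \lceil k_{f,d} \rceil + i the right-hand side is at most 2^{-i},
% which is exactly the condition of Definition 3.1 for i valid bits.
```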

We now proceed to the third and final step. We have the problem that $k_{f,d}$ depends on the
cone vector $d$, about which we only know that it exists (Proposition 3.4). We now sketch
how to obtain the threshold $k_f$ claimed in Theorem 3.2, which is independent of any cone
vectors.

Consider Proposition 3.6 and let $\lambda_{\max} = \frac{\mu f_i}{d_i}$ and
$\lambda_{\min} = \frac{\mu f_j}{d_j}$ for suitable indices $i$ and $j$. We have
$k_{f,d} = \log\left(\frac{d_j}{d_i} \cdot \frac{\mu f_i}{\mu f_j}\right)$. The idea is to bound
$k_{f,d}$ in terms of $c_{\min}$. We show that if $k_{f,d}$ is very large, then there must be
variables $X, Y$ such that $X$ depends on $Y$ only via a monomial that has a very small
coefficient, which implies that $c_{\min}$ is very small.



4. Stochastic Models 

As mentioned in the introduction, several problems concerning stochastic models can be
reduced to problems about the least solution $\mu f$ of an MSPE $f$. In these cases, $\mu f$ is a
vector of probabilities, and so $\mu_{\max} \le 1$. Moreover, we can obtain information on
$\mu_{\min}$, which leads to bounds on the threshold $k_f$.



4.1. Probabilistic Pushdown Automata 

Our study of MSPs was initially motivated by the verification of probabilistic pushdown
automata. A probabilistic pushdown automaton (pPDA) is a tuple
$\mathcal{P} = (Q, \Gamma, \delta, \mathit{Prob})$ where $Q$ is a finite set of control states, $\Gamma$
is a finite stack alphabet, $\delta \subseteq Q \times \Gamma \times Q \times \Gamma^*$ is a finite
transition relation (we write $pX \hookrightarrow q\alpha$ instead of $(p, X, q, \alpha) \in \delta$), and
$\mathit{Prob}$ is a function which to each transition $pX \hookrightarrow q\alpha$ assigns its probability
$\mathit{Prob}(pX \hookrightarrow q\alpha) \in (0, 1]$ so that for all $p \in Q$ and $X \in \Gamma$ we have
$\sum_{pX \hookrightarrow q\alpha} \mathit{Prob}(pX \hookrightarrow q\alpha) = 1$. We write
$pX \xhookrightarrow{x} q\alpha$ instead of $\mathit{Prob}(pX \hookrightarrow q\alpha) = x$. A
configuration of $\mathcal{P}$ is a pair $qw$, where $q$ is a control state and $w \in \Gamma^*$ is a stack
content. A probabilistic pushdown automaton $\mathcal{P}$ naturally induces a possibly infinite
Markov chain with the configurations as states and transitions given by:
$pX\beta \xrightarrow{x} q\alpha\beta$ for every $\beta \in \Gamma^*$ iff
$pX \xhookrightarrow{x} q\alpha$. We assume w.l.o.g. that if $pX \hookrightarrow q\alpha$ is a
transition then $|\alpha| \le 2$.

pPDAs and the equivalent model of recursive Markov chains have been very thoroughly
studied [H [TTJ], [HJ [71 [TJ] . These papers have shown that the key to the analysis of pPDAs
are the termination probabilities $[pXq]$, where $p$ and $q$ are states, and $X$ is a stack letter,
defined as follows (see e.g. [B] for a more formal definition): $[pXq]$ is the probability that,
starting at the configuration $pX$, the pPDA eventually reaches the configuration $q\varepsilon$ (empty
stack). It is not difficult to show that the vector of termination probabilities is the least
fixed point of the MSPE containing the equation

$$[pXq] = \sum_{pX \xhookrightarrow{x} rYZ} x \cdot \sum_{t \in Q} [rYt] \cdot [tZq] \; + \sum_{pX \xhookrightarrow{x} rY} x \cdot [rYq] \; + \sum_{pX \xhookrightarrow{x} q\varepsilon} x$$

for each triple $(p, X, q)$. Call this quadratic MSPE the termination MSPE of the pPDA
(we assume that termination MSPEs are clean, and it is easy to see that they are always
feasible). We immediately have that if $X = f(X)$ is a termination MSP, then $\mu_{\max} \le 1$.
We also obtain a lower bound on $\mu_{\min}$:



Lemma 4.1. Let $X = f(X)$ be a termination MSPE with $n$ variables. Then
$\mu_{\min} \ge c_{\min}^{(2^{n+1} - 1)}$.

Together with Theorem 3.2 we get the following exponential bound for $k_f$.

Proposition 4.2. Let $f$ be a strongly connected termination MSP with $n$ variables and
whose coefficients are expressed as ratios of $m$-bit numbers. Then $k_f \le n 2^{n+2} m$.

We conjecture that there is a lower bound on $k_f$ which is exponential in $n$ for the following
reason. We know a family $(f^{(n)})_{n = 1, 3, 5, \ldots}$ of strongly connected MSPs with $n$ variables
and irrational coefficients such that $c_{\min}^{(n)}$ takes the same constant value for all $n$ and
$\mu_{\min}^{(n)}$ is double-exponentially small in $n$. Experiments suggest that $\Theta(2^n)$ iterations
are needed for the first bit of $\mu f^{(n)}$, but we do not have a proof.

4.2. Strict pPDAs and Back-Button Processes 

A pPDA is strict if for all $pX \in Q \times \Gamma$ and all $q \in Q$ the transition relation contains
a pop-rule $pX \xhookrightarrow{x} q\varepsilon$ for some $x > 0$. Essentially, strict pPDAs model
programs in which every procedure has at least one terminating execution that does not call any
other procedure. The termination MSP of a strict pPDA is of the form $b(X, X) + lX + c$ for
$c \succ 0$. So we have $\mu f \ge c$, which implies $\mu_{\min} \ge c_{\min}$. Together with
Theorem 3.2 we get:

Proposition 4.3. Let $f$ be a strongly connected termination MSP with $n$ variables and
whose coefficients are expressed as ratios of $m$-bit numbers. If $f$ is derived from a strict
pPDA, then $k_f \le 3nm$.

Since in most applications m is small, we obtain an excellent convergence threshold. 

In [13, 14] Fagin et al. introduce a special class of strict pPDAs called back-button
processes: in a back-button process there is only one control state $p$, and any rule is of the
form $pA \xhookrightarrow{b_A} p\varepsilon$ or $pA \xhookrightarrow{l_{AB}} pBA$. So the stack
corresponds to a path through a finite graph with $\Gamma$ as set of nodes and edges $A \to B$ for
$pA \xhookrightarrow{l_{AB}} pBA$.

In [13, 14] back-button processes are used to model the behaviour of web-surfers: $\Gamma$ is
the set of web-pages, $l_{AB}$ is the probability that a web-surfer uses a link from page $A$ to page
$B$, and $b_A$ is the probability that the surfer pushes the "back"-button of the web-browser
while visiting $A$. Thus, the termination probability $[pAp]$ is simply the probability that, if $A$
is on top of the stack, $A$ is eventually popped from the stack. The termination probabilities
are the least solution of the MSPE consisting of the equations

$$[pAp] = b_A + \sum_{pA \xhookrightarrow{l_{AB}} pBA} l_{AB} [pBp] [pAp] = b_A + [pAp] \sum_{pA \xhookrightarrow{l_{AB}} pBA} l_{AB} [pBp] .$$
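
As an illustration, the right-hand side of this MSPE can be built mechanically from the link and back-button probabilities. The sketch below is not part of the paper: the dictionary encoding, the function name `back_button_mspe` and the two-page example are assumptions made for the sake of a small runnable example.

```python
def back_button_mspe(back, links):
    """Build f for the back-button MSPE [pAp] = b_A + [pAp] * sum_B l_AB * [pBp].

    back:  dict mapping each page A to its back-button probability b_A
    links: dict mapping page A to a dict {B: l_AB} of link probabilities
    Returns the page order used for vector components and the function f.
    """
    pages = sorted(back)
    index = {page: i for i, page in enumerate(pages)}

    def f(x):
        result = []
        for page in pages:
            forward = sum(prob * x[index[target]]
                          for target, prob in links.get(page, {}).items())
            result.append(back[page] + x[index[page]] * forward)
        return result

    return pages, f

# Hypothetical two-page process: from A the surfer follows the link to B with
# probability 0.5 or presses "back" with probability 0.5; B only allows "back".
pages, f = back_button_mspe(back={"A": 0.5, "B": 1.0},
                            links={"A": {"B": 0.5}})
first_kleene_step = f([0.0, 0.0])
```

For this small process the vector of ones is a fixed point of `f`, in line with the observation that termination probabilities satisfy $\mu_{\max} \le 1$.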

4.3. An Example

As an example of application of Theorem 3.2 consider the following scMSPE $X = f(X)$.

$$\begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} = \begin{pmatrix} 0.4 X_2 X_1 + 0.6 \\ 0.3 X_1 X_2 + 0.4 X_3 X_2 + 0.3 \\ 0.3 X_1 X_3 + 0.7 \end{pmatrix}$$

The least solution of the system gives the revocation probabilities of a back-button process 
with three web-pages. For instance, if the surfer is at page 2 it can choose between following 
links to pages 1 and 3 with probabilities 0.3 and 0.4, respectively, or pressing the back button 
with probability 0.3. 

We wish to know if any of the revocation probabilities is equal to 1. Performing 14 Newton
steps (e.g. with Maple) yields an approximation $\nu^{(14)}$ to the termination probabilities
with

$$\begin{pmatrix} 0.98 \\ 0.97 \\ 0.992 \end{pmatrix} \le \nu^{(14)} \le \begin{pmatrix} 0.99 \\ 0.98 \\ 0.993 \end{pmatrix} .$$

We have $c_{\min} = 0.3$. In addition, since Newton's method converges to $\mu f$ from below,
we know $\mu_{\min} \ge 0.97$. Moreover, $\mu_{\max} \le 1$, as $1 = f(1)$ and so $\mu f \le 1$. Hence
$k_f \le 3 \cdot \log \frac{1}{0.3 \cdot 0.97 \cdot 0.97} \le 6$. Theorem 3.2 then implies that
$\nu^{(14)}$ has (at least) 8 valid bits of $\mu f$. As $\mu f \le 1$, the absolute errors are bounded by
the relative errors, and since $2^{-8} \le 0.004$ we know:

$$\mu f \le \nu^{(14)} + \begin{pmatrix} 2^{-8} \\ 2^{-8} \\ 2^{-8} \end{pmatrix} \le \begin{pmatrix} 0.994 \\ 0.984 \\ 0.997 \end{pmatrix} \prec \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$

So Theorem 3.2 gives a proof that all 3 revocation probabilities are strictly smaller than 1.
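
The computation in this example can be replayed with a short pure-Python sketch (the paper itself mentions Maple). It assumes the three-page system $X_1 = 0.4 X_2 X_1 + 0.6$, $X_2 = 0.3 X_1 X_2 + 0.4 X_3 X_2 + 0.3$, $X_3 = 0.3 X_1 X_3 + 0.7$, takes $\log$ to base 2, and uses floating-point arithmetic, so it illustrates the argument rather than replacing the exact reasoning. The helper names are illustrative.

```python
import math

def f(x):
    x1, x2, x3 = x
    return [0.4 * x2 * x1 + 0.6,
            0.3 * x1 * x2 + 0.4 * x3 * x2 + 0.3,
            0.3 * x1 * x3 + 0.7]

def jacobian(x):
    x1, x2, x3 = x
    return [[0.4 * x2, 0.4 * x1, 0.0],
            [0.3 * x2, 0.3 * x1 + 0.4 * x3, 0.4 * x2],
            [0.3 * x3, 0.0, 0.3 * x1]]

def solve(a, b):
    """Solve a * y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    y = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (m[r][n] - tail) / m[r][r]
    return y

def newton_step(x):
    # N_f(x) = x + (Id - f'(x))^(-1) (f(x) - x)
    n = len(x)
    fx, jac = f(x), jacobian(x)
    a = [[(1.0 if i == k else 0.0) - jac[i][k] for k in range(n)] for i in range(n)]
    delta = solve(a, [fx[i] - x[i] for i in range(n)])
    return [x[i] + delta[i] for i in range(n)]

nu = [0.0, 0.0, 0.0]
for _ in range(14):
    nu = newton_step(nu)

# Newton iterates are lower bounds of mu f, so min(nu) is a lower bound of mu_min.
c_min, mu_min_lower, mu_max_upper = 0.3, min(nu), 1.0
k_f = 3 * math.log2(mu_max_upper / (c_min * mu_min_lower * min(mu_min_lower, 1.0)))
valid_bits = 14 - math.ceil(k_f)
upper_bound = [v + 2.0 ** -valid_bits for v in nu]  # componentwise bound on mu f
```

Every component of `upper_bound` stays below 1, which mirrors the conclusion that all three revocation probabilities are strictly smaller than 1.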



5. Linear Convergence of the Decomposed Newton's Method 

Given a strongly connected MSP $f$, Theorem 3.2 states that, if we have computed $k_f$
preparatory iterations of Newton's method, then after $i$ additional iterations we can be sure
to have computed at least $i$ bits of $\mu f$. We call this linear convergence with rate 1. Now we
show that DNM, which handles non-strongly-connected MSPs, converges linearly as well.
We also give an explicit convergence rate.

Let $f(X)$ be any quadratic MSP (again we assume quadratic MSPs throughout this
section), and let $h(f)$ denote the height of the DAG of strongly connected components
(SCCs). The convergence rate of DNM crucially depends on this height: In the worst
case one needs asymptotically $\Theta(2^{h(f)})$ iterations in each component per bit, assuming one
performs the same number of iterations in each component.

To get a sharper result, we suggest performing a different number of iterations in each
SCC, depending on its depth. The depth of an SCC $S$ is the length of the longest path in
the DAG of SCCs from $S$ to a top SCC.

In addition, we use the following notation. For a depth $t$, we denote by $\mathit{comp}(t)$ the
set of SCCs of depth $t$. Furthermore we define $C(t) := \bigcup \mathit{comp}(t)$ and
$C_{>}(t) := \bigcup_{t' > t} C(t')$ and, analogously, $C_{<}(t)$. We will sometimes write $v_t$ for
$v_{C(t)}$ and $v_{>t}$ for $v_{C_{>}(t)}$ and $v_{<t}$ for $v_{C_{<}(t)}$, where $v$ is any vector.

Figure 1 shows the Decomposed Newton's Method (DNM) for computing an approximation
$\nu$ for $\mu f$, where $f(X)$ is any quadratic MSP. The authors of [10] recommend to
run Newton's Method in each SCC $S$ until "approximate solutions for $S$ are considered
'good enough'". Here we suggest to run Newton's Method in each SCC $S$ for a number of
steps that depends (exponentially) on the depth of $S$ and (linearly) on a parameter $j$ that
controls the number of iterations (see Figure 1).

function DNM($f$, $j$)
    /* The parameter $j$ controls the number of iterations. */
    for $t$ from $h(f)$ downto 0
        forall $S \in \mathit{comp}(t)$    /* all SCCs $S$ of depth $t$ */
            $\nu_S := \mathcal{N}_{f_S}^{j \cdot 2^t}(0)$    /* $j \cdot 2^t$ iterations */
            /* apply $\nu_S$ in the depending SCCs */
            $f_{<t}(X) := f_{<t}(X)[X_S / \nu_S]$
    return $\nu$

Figure 1: Decomposed Newton's Method (DNM) for computing an approximation $\nu$ of $\mu f$



Recall that $h(f)$ was defined as the height of the DAG of SCCs. Similarly we define the
width $w(f)$ to be $\max_t |\mathit{comp}(t)|$. Notice that $f$ has at most $(h(f) + 1) \cdot w(f)$ SCCs.
We have the following bound on the number of iterations run by DNM.

Proposition 5.1. The function DNM($f$, $j$) of Figure 1 runs at most
$j \cdot w(f) \cdot 2^{h(f)+1}$ iterations of Newton's method.
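
The bound of Proposition 5.1 follows from summing $j \cdot 2^t$ iterations over all SCCs of each depth $t$. The sketch below makes that count explicit; the list encoding `sccs_per_depth` (entry $t$ holds $|\mathit{comp}(t)|$) and the function names are assumptions for illustration.

```python
def dnm_iteration_count(sccs_per_depth, j):
    """Total Newton iterations of DNM(f, j): every SCC of depth t gets j * 2^t."""
    return sum(count * j * 2 ** t for t, count in enumerate(sccs_per_depth))

def dnm_iteration_bound(sccs_per_depth, j):
    """Upper bound j * w(f) * 2^(h(f)+1) from Proposition 5.1."""
    height = len(sccs_per_depth) - 1   # depths range over 0 .. h(f)
    width = max(sccs_per_depth)        # w(f) = max_t |comp(t)|
    return j * width * 2 ** (height + 1)

# Hypothetical DAG of SCCs: one SCC of depth 0, two of depth 1, one of depth 2.
example_shape = [1, 2, 1]
exact = dnm_iteration_count(example_shape, j=3)   # 3 * (1*1 + 2*2 + 1*4)
bound = dnm_iteration_bound(example_shape, j=3)   # 3 * 2 * 2^3
```

The exact count never exceeds the bound because $\sum_{t=0}^{h} 2^t = 2^{h+1} - 1$ and each depth holds at most $w(f)$ SCCs.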

We will now analyze the convergence behavior of DNM asymptotically (for large $j$). Let
$\Delta_S^{(j)}$ denote the error in $S$ when running DNM with parameter $j$, i.e.,
$\Delta_S^{(j)} := \mu_S - \nu_S^{(j)}$. Observe that the error $\Delta_t^{(j)}$ can be understood
as the sum of two errors:

$$\Delta_t^{(j)} = \mu_t - \nu_t^{(j)} = (\mu_t - \tilde{\mu}_t^{(j)}) + (\tilde{\mu}_t^{(j)} - \nu_t^{(j)}) ,$$

where $\tilde{\mu}_t^{(j)} := \mu\left(f_t(X)[X_{>t} / \nu_{>t}^{(j)}]\right)$, i.e., $\tilde{\mu}_t^{(j)}$ is the
least fixed point of $f_t$ after the approximations from the lower SCCs have been applied. So,
$\Delta_t^{(j)}$ consists of the propagation error $(\mu_t - \tilde{\mu}_t^{(j)})$ and the newly
inflicted approximation error $(\tilde{\mu}_t^{(j)} - \nu_t^{(j)})$.

The following lemma, technically non-trivial to prove, gives a bound on the propagation
error.

Lemma 5.2 (Propagation error). There is a constant $c > 0$ such that

$$\|\mu_t - \tilde{\mu}_t\| \le c \cdot \sqrt{\|\mu_{>t} - \nu_{>t}\|}$$

holds for all $\nu_{>t}$ with $0 \le \nu_{>t} \le \mu_{>t}$, where
$\tilde{\mu}_t = \mu\left(f_t(X)[X_{>t} / \nu_{>t}]\right)$.

Intuitively, Lemma 5.2 states that if $\nu_{>t}$ has $k$ valid bits of $\mu_{>t}$, then $\tilde{\mu}_t$
has roughly $k/2$ valid bits of $\mu_t$. In other words, (at most) one half of the valid bits are lost
on each level of the DAG due to the propagation error.

The following theorem assures that after combining the propagation error and the
approximation error, DNM still converges linearly.

Theorem 5.3. Let $f$ be a quadratic MSP. Let $\nu^{(j)}$ denote the result of calling DNM($f$, $j$)
(see Figure 1). Then there is a $k_f \in \mathbb{N}$ such that $\nu^{(k_f + i)}$ has at least $i$ valid
bits of $\mu f$ for every $i \ge 0$.

We conclude that increasing $i$ by one gives us asymptotically at least one additional bit in
each component and, by Proposition 5.1, costs $w(f) \cdot 2^{h(f)+1}$ additional Newton iterations.

In the technical report [5] we give an example that shows that the bound above is
essentially optimal in the sense that an exponential (in $h(f)$) number of iterations is in
general needed to obtain an additional bit.

6. Newton's Method for General MSPs 

Etessami and Yannakakis [10] introduced DNM because they could show that the matrix
inverses used by Newton's method exist if Newton's method is run on each SCC separately
(see Theorem 2.6).

It may be surprising that the matrix inverses used by Newton's method exist even if
the MSP is not decomposed. More precisely one can show the following theorem, see [5].

Theorem 6.1. Let $f(X)$ be any MSP, not necessarily strongly connected. Let the Newton
operator $\mathcal{N}_f$ be defined as before:

$$\mathcal{N}_f(X) = X + (\mathrm{Id} - f'(X))^{-1}(f(X) - X)$$

Then the Newton sequence $(\nu_f^{(k)})_{k \in \mathbb{N}}$ with $\nu^{(k)} = \mathcal{N}_f^k(0)$ is
well-defined (i.e., the matrix inverses exist), monotonically increasing, bounded from above by
$\mu f$ (i.e. $\nu^{(k)} \le \nu^{(k+1)} \prec \mu f$), and converges to $\mu f$.

By exploiting Theorem 5.3 and Theorem 6.1 one can show the following theorem which
addresses the convergence speed of Newton's Method in general.

Theorem 6.2. Let $f$ be any quadratic MSP. Then the Newton sequence
$(\nu^{(k)})_{k \in \mathbb{N}}$ is well-defined and converges linearly to $\mu f$. More precisely, there
is a $k_f \in \mathbb{N}$ such that $\nu^{(k_f + i \cdot (h(f)+1) \cdot 2^{h(f)})}$ has at least $i$ valid
bits of $\mu f$ for every $i \ge 0$.

Again, the $2^{h(f)}$ factor cannot be avoided in general as shown by an example in [5].

7. Conclusions 

We have proved a threshold $k_f$ for strongly connected MSPEs. After $k_f + i$ Newton iterations
we have $i$ bits of accuracy. The threshold $k_f$ depends on the representation size of $f$ and
on the least solution $\mu f$. Although this latter dependence might seem to be a problem,
lower and upper bounds on $\mu f$ can be easily derived for stochastic models (probabilistic
programs with procedures, stochastic context-free grammars and back-button processes).
In particular, this allows us to show that $k_f$ depends linearly on the representation size for
back-button processes. We have also shown by means of an example that the threshold $k_f$
improves when the number of iterations increases.

In [16] we left the problem whether DNM converges linearly for non-strongly-connected
MSPEs open. We have proven that this is the case, although the convergence rate is poorer:
if $h$ and $w$ are the height and width of the graph of SCCs of $f$, then there is a threshold
$k_f$ such that $k_f + i \cdot w \cdot 2^{h+1}$ iterations of DNM compute at least $i$ valid bits of
$\mu f$, where the exponential factor cannot be avoided in general.

Finally, we have shown that the matrix inverses used by Newton's method are guaranteed to
exist for the whole MSPE, whether the MSPE is strongly connected or not.